Keyword [Image Transformation] [geometric matching]

Rocco I, Arandjelovic R, Sivic J. Convolutional neural network architecture for geometric matching[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 6148-6157.

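This paper:

Proposes a CNN architecture that mimics the stages of classical geometric matching pipelines (Figure 1): feature extraction, matching, and simultaneous inlier detection and model parameter estimation.
Trains the model on synthetic data, so no manually annotated data (geometric transformation parameters) is needed.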

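1. Model architecture

(Figure 2) The model consists of feature extraction, a matching network, and a regression network. The two feature extraction CNNs in the figure share the same parameters, i.e. a single network extracts the features of both image A and image B.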

2. Feature extraction

Use VGG16 network cropped at the pool4 layer (before the ReLU unit), followed by per-feature L2-normalization.

3. Matching network

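The matching network computes a correlation map from the CNN features (Figure 3). For a given spatial location in feature map A, it computes the correlation with every spatial location in feature map B, mimicking the matching step of classical geometric matching algorithms. Channel-wise normalization and a ReLU are additionally applied to amplify the score of the match.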

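The paper compares this approach with the commonly used concatenation and subtraction alternatives (Table 2) and shows that correlation performs better. The reasons given are:

The subsequent regression network is a stack of conv layers, which on concatenated or subtracted features is unable to detect long-range matches.
For different image pairs related by the same geometric transformation, concatenation and subtraction produce different outputs, which makes the regression network's task harder.

In addition, normalizing the correlation map improves results by 4 percentage points.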
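The correlation step described above can be sketched in NumPy. This is a minimal illustration, not the authors' code: the function names, the `(h, w, d)` feature layout, the row-major indexing of locations in A, and the ordering "ReLU first, then channel-wise L2 normalization" are all assumptions made for the example. The features are assumed to be L2-normalized per location, as in section 2.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit L2 norm."""
    norm = np.sqrt(np.sum(x * x, axis=axis, keepdims=True))
    return x / (norm + eps)

def correlation_map(feat_a, feat_b):
    """Dense correlation between two feature maps of shape (h, w, d).

    Output c has shape (h, w, h * w), where c[i, j, k] is the dot product
    between the feature of B at (i, j) and the feature of A at flat
    (row-major, assumed) location k.
    """
    h, w, d = feat_a.shape
    flat_a = feat_a.reshape(h * w, d)      # one row per location of A
    corr = feat_b @ flat_a.T               # (h, w, h * w) similarity scores
    corr = np.maximum(corr, 0.0)           # ReLU: drop negative correlations
    return l2_normalize(corr, axis=-1)     # channel-wise normalization

rng = np.random.default_rng(0)
feat_a = l2_normalize(rng.standard_normal((4, 5, 8)))
feat_b = l2_normalize(rng.standard_normal((4, 5, 8)))
corr = correlation_map(feat_a, feat_b)
self_corr = correlation_map(feat_a, feat_a)
```

Because every output location carries its scores against all `h * w` locations of A, the channel dimension grows with the spatial size of the feature map. Correlating a map with itself should put the strongest response of location `(i, j)` in channel `i * w + j`.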

4. Regression Network

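Because of parameter count, memory, and computation concerns, the regression network (Figure 4) uses conv layers with local receptive fields rather than FC layers. This works because each spatial location of the A-B correlation map holds the similarity scores between that location in feature map B and all spatial locations in feature map A, so even though the convolutions are local, they still see global information.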

5. Hierarchy of transformations

To obtain more accurate results, the paper proposes a hierarchical model (Figure 5) with two stages.

The first stage estimates an affine transformation with 6 parameters.
The second stage estimates a thin-plate spline transformation with 18 parameters.
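The data flow of the two stages can be sketched as below, assuming (as I read Figure 5 of the paper) that image A is first warped with the estimated affine transformation and the second stage then runs on the warped image and image B. `estimate_affine`, `estimate_tps`, and `warp` are hypothetical callables standing in for the two trained networks and an image warper; the toy stand-ins at the bottom exist only to make the sketch runnable.

```python
def two_stage_alignment(image_a, image_b, estimate_affine, estimate_tps, warp):
    """Coarse-to-fine estimation: affine first, then thin-plate spline."""
    theta_aff = estimate_affine(image_a, image_b)   # stage 1: 6 parameters
    warped_a = warp(image_a, theta_aff)             # roughly align A with B
    theta_tps = estimate_tps(warped_a, image_b)     # stage 2: 18 parameters
    return theta_aff, theta_tps


# Toy stand-ins: an "image" is a single offset, a "transformation" is a shift.
def coarse(a, b):
    return float(round(b - a))   # only recovers whole-unit shifts

def fine(a, b):
    return b - a                 # recovers the remaining residual

def shift(a, theta):
    return a + theta

theta_aff, theta_tps = two_stage_alignment(0.0, 1.3, coarse, fine, shift)
```

In the toy run the coarse stage removes the bulk of the misalignment and the fine stage only has to explain the small residual, which is the motivation for the hierarchy.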

6. Loss Function

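$g_i$ are the points of a uniform grid over $[-1, 1]$; the loss is the squared distance between the grid transformed by the estimated transformation and the grid transformed by the ground-truth transformation.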
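In the paper's notation this loss reads $\mathcal{L}(\hat{\theta}, \theta_{GT}) = \frac{1}{N} \sum_{i=1}^{N} d(T_{\hat{\theta}}(g_i), T_{\theta_{GT}}(g_i))^2$. A minimal NumPy sketch for the affine case follows; the 20 x 20 grid size, the parameter layout `[a11, a12, tx, a21, a22, ty]`, and the function names are assumptions for illustration.

```python
import numpy as np

def make_grid(n=20):
    """Uniform n x n grid of points in [-1, 1]^2, shape (n * n, 2)."""
    xs = np.linspace(-1.0, 1.0, n)
    gx, gy = np.meshgrid(xs, xs)
    return np.stack([gx.ravel(), gy.ravel()], axis=1)

def affine_transform(theta, points):
    """Apply a 6-parameter affine map to points of shape (N, 2).

    theta is laid out as [a11, a12, tx, a21, a22, ty] (assumed layout).
    """
    mat = np.asarray(theta, dtype=float).reshape(2, 3)
    return points @ mat[:, :2].T + mat[:, 2]

def grid_loss(theta_est, theta_gt, grid):
    """Mean squared distance between the two transformed grids."""
    diff = affine_transform(theta_est, grid) - affine_transform(theta_gt, grid)
    return np.mean(np.sum(diff ** 2, axis=1))

grid = make_grid()
identity = [1, 0, 0, 0, 1, 0]
shifted = [1, 0, 0.1, 0, 1, 0]        # identity plus a 0.1 shift in x
loss_same = grid_loss(identity, identity, grid)
loss_shift = grid_loss(shifted, identity, grid)
```

Measuring the error on transformed grid points rather than directly on the parameters means the loss reflects how far points actually move, regardless of how the transformation is parameterized.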

7. Dataset

Generate each example by sampling image A from a public image dataset, and generating image B by applying a random transformation $T_{\theta_{GT}}$ to image A.
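A minimal sketch of this synthetic pair generation for the affine case is shown below. The sampling ranges, the nearest-neighbour warping, the zero fill for out-of-range pixels, and the convention that $T_{\theta_{GT}}$ maps output coordinates to sampling positions in image A are all assumptions made for the example, not details taken from the paper.

```python
import numpy as np

def random_affine(rng, max_perturb=0.2, max_shift=0.2):
    """Sample theta (2 x 3) as identity plus a small random perturbation."""
    theta = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    theta[:, :2] += rng.uniform(-max_perturb, max_perturb, size=(2, 2))
    theta[:, 2] = rng.uniform(-max_shift, max_shift, size=2)
    return theta

def warp_nearest(image, theta):
    """Warp a (H, W) image by sampling it at T_theta(grid), nearest neighbour.

    Coordinates are normalized to [-1, 1]; out-of-range samples become 0.
    """
    h, w = image.shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    src_x = theta[0, 0] * xs + theta[0, 1] * ys + theta[0, 2]
    src_y = theta[1, 0] * xs + theta[1, 1] * ys + theta[1, 2]
    col = np.rint((src_x + 1) / 2 * (w - 1)).astype(int)
    row = np.rint((src_y + 1) / 2 * (h - 1)).astype(int)
    valid = (col >= 0) & (col < w) & (row >= 0) & (row < h)
    out = np.zeros_like(image)
    out[valid] = image[row[valid], col[valid]]
    return out

def make_training_pair(image_a, rng):
    """Return (image A, image B, ground-truth theta) for one training example."""
    theta_gt = random_affine(rng)
    image_b = warp_nearest(image_a, theta_gt)
    return image_a, image_b, theta_gt

rng = np.random.default_rng(0)
image_a = rng.random((16, 16))
_, image_b, theta_gt = make_training_pair(image_a, rng)
```

Since `theta_gt` is known by construction, each pair comes with its label for free, which is why no manual annotation is needed.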